Robust pitch tracking for prosodic modeling in telephone speech

Authors

Chao Wang

Stephanie Seneff

Abstract

In this paper, we introduce a pitch detection algorithm that is particularly robust for telephone speech and prosodic modeling. The algorithm uses a logarithmically sampled spectral representation of speech, similar to that in the subharmonic summation approach [2]. Constraints for log F0 and Δ log F0 are combined in a dynamic programming search to find an optimum pitch track. The search algorithm is able to find a continuous pitch contour regardless of the voicing status, while a separate voicing decision module computes a probability of voicing per frame. We evaluated the algorithm using the Keele pitch extraction reference database [4] under both studio and telephone conditions. Our algorithm is very robust to channel degradation, and compares favorably to xwaves under telephone conditions. It also significantly outperforms xwaves when used for tone classification on a telephone quality Mandarin digit corpus.
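The dynamic programming search described in the abstract can be illustrated with a small Viterbi-style sketch. It is not the authors' implementation. The per-frame candidate scores, the quadratic penalty on Δ log F0, the `delta_weight` value, and the helper names `make_log_grid` and `track_pitch` are all assumptions made for illustration. The abstract only says that log F0 and Δ log F0 constraints are combined in the search.

```python
import math


def make_log_grid(f_min=50.0, f_max=400.0, n=60):
    """Hypothetical grid of F0 candidates, evenly spaced in log F0."""
    step = (math.log(f_max) - math.log(f_min)) / (n - 1)
    return [math.log(f_min) + k * step for k in range(n)]


def track_pitch(frame_scores, log_f0_grid, delta_weight=50.0):
    """Return one grid index per frame that maximizes the summed frame
    scores minus a penalty on frame-to-frame changes in log F0.

    frame_scores[t][k] is an assumed "goodness" of candidate k in frame t
    (for example, a peak height in a log-sampled spectrum). The quadratic
    penalty is an illustrative choice, not taken from the paper.
    """
    n_cand = len(log_f0_grid)
    best = list(frame_scores[0])  # best cumulative score ending at each candidate
    backptrs = []                 # best predecessor per candidate, per frame
    for scores in frame_scores[1:]:
        new_best, back = [], []
        for j in range(n_cand):
            # Score of arriving at candidate j from every predecessor i.
            arrive = [
                best[i] - delta_weight * (log_f0_grid[j] - log_f0_grid[i]) ** 2
                for i in range(n_cand)
            ]
            i_star = max(range(n_cand), key=arrive.__getitem__)
            new_best.append(arrive[i_star] + scores[j])
            back.append(i_star)
        best = new_best
        backptrs.append(back)
    # Backtrace from the best final candidate.
    j = max(range(n_cand), key=best.__getitem__)
    path = [j]
    for back in reversed(backptrs):
        j = back[j]
        path.append(j)
    path.reverse()
    return path


if __name__ == "__main__":
    grid = make_log_grid()
    # Synthetic scores: a steady candidate plus one spurious stronger peak.
    scores = [[0.0] * len(grid) for _ in range(10)]
    for t in range(10):
        scores[t][20] = 1.0
    scores[5][44] = 1.2
    path = track_pitch(scores, grid)
    print([round(math.exp(grid[k]), 1) for k in path])
```

Because every frame receives a candidate, the search returns a continuous contour whether or not the frame is voiced, which mirrors the abstract's separation between pitch tracking and the voicing decision. In the synthetic example, the continuity penalty keeps the track on the steady candidate even though one frame has a stronger spurious peak far away on the log F0 axis.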


Similar resources

The Prosody of Discourse Structure and Content in the Production of Persian EFL Learners

The present research addressed the prosodic realization of global and local text structure and content in the spoken discourse data produced by Persian EFL learners. Two newspaper articles were analyzed using Rhetorical Structure Theory. Based on these analyses, the global structure in terms of hierarchical level, the local structure in terms of the relative importance of text segments and the ...


Modeling the prosody of hidden events for improved word recognition

We investigate a new approach for using speech prosody as a knowledge source for speech recognition. The idea is to penalize word hypotheses that are inconsistent with prosodic features such as duration and pitch. To model the interaction between words and prosody we modify the language model to represent hidden events such as sentence boundaries and various forms of disfluency, and combine wit...


Statistical prosodic modeling: from corpus design to parameter estimation

The increasing availability of carefully designed and collected speech corpora opens up new possibilities for the statistical estimation of formal multivariate prosodic models. At Apple Computer, statistical prosodic modeling exploits the Victoria corpus, recently created to broadly support ongoing speech synthesis research and development. This corpus is composed of five constituent parts, eac...


Improved generation of prosodic features in HMM-based Mandarin speech synthesis

The HMM-based Text-to-Speech System can produce high quality synthetic speech with flexible modeling of spectral and prosodic parameters. However, the prosodic features, like F0 and duration trajectories, generated by HMM-based speech synthesis are often excessively smoothed and lack prosodic variance. In HMM-based TTS durations are typically modeled statistically using state duration probabili...


Symbolic and Direct Sequential Modeling of Prosody for Classification of Speaking-Style and Nativeness

In this paper, we explore the differences between direct and symbolic sequential modeling of prosody. We use sequential models to characterize speech in two tasks, classifying speaking-style and distinguishing native from non-native speech. We explore the use of a spike-and-slab model to directly model pitch contour data. We find in both of these tasks that sequences of symbolic prosodic events...




Publication date: 2000
